{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "50e13f55",
   "metadata": {},
   "source": [
    "# 高斯玻色采样应用到稠密子图问题"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "18b13be1",
   "metadata": {},
   "source": [
    "1. 图和子图\n",
    "\n",
    "数学上图 $G$ 的定义如下: \n",
    "\n",
    "$$G=(V,E)$$\n",
    "\n",
    "集合V中的元素称为节点，集合E中的元素时两个节点组成的无序对，称为边。\n",
    "集合V称为点集，E称为边集。\n",
    "在图的定义中边的概念定义了节点上的一个对称关系，即邻接关系(adjacency relation)。对于两个节点 $x$，$y$，如果 $(x,y)$ 是一条边，则称他们是邻接的，因此一张图可以用一个 $n\\times n$ 的邻接矩阵A来表示。比如对于四个节点的全连接图对应的A如下。\n",
    "$$A =\\begin{pmatrix}\n",
    "0&1&1&1\\\\\n",
    "1&0&1&1\\\\\n",
    "1&1&0&1\\\\\n",
    "1&1&1&0\n",
    "\\end{pmatrix}$$\n",
    "\n",
    "haf(A) = 3, 表示完美匹配数为3。\n",
    "\n",
    "子图对应的点集和边集分别是图G的点集的子集和边集的子集，稠密子图直观上对应着连接密集的子图，图密度的定义如下\n",
    "\n",
    "$$d(G) = \\frac{2|E|}{|V|(|V|-1)}$$\n",
    "\n",
    "$|E|$ 表示对应的边的条数，$|V|$ 表示对应的节点个数。\n",
    "那么稠密子图就对应着图密度很大的子图。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "896de506",
   "metadata": {},
   "source": [
    "2. GBS采样和稠密子图\n",
    "\n",
    "参考文献[1]中讨论了图 $G$ 的完美匹配数和图密度 $d(G)$ 的关系，hafnian的计算对应图 $G$ 的完美匹配数，那么hafnian值越大那么图的稠密度越高。\n",
    "由前面讨论可知高斯玻色采样下hafnian值越大也对应着采样概率越高，即概率越高的样本对应的子图的稠密度越大。\n",
    "在使用粒子数分辨探测器时，通过后选择对采样的样本筛选出只有0，1的结果，这些结果中出现概率较高的Fock态所映射的子图就对应了稠密子图。\n",
    "同时还可以用经典算法来寻找稠密子图，这里用到的经典算法如下，\n",
    "\n",
    "a. 选择子图规模大小 $[k_{min},k_{max}]$。\n",
    "\n",
    "b. 子图规模从大到小搜索(shrinking)，从全图开始，对于 $k>k_{min}$，每次搜索随机删除一个连接数最少的节点，剩下的节点组合成当前规模下的稠密子图。\n",
    "\n",
    "c. 子图规模从小到大搜索(growth)，对于 $k<k_{max}$，每次搜索随机添加一个连接数最多的节点，组成当前规模下的稠密子图。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0ed6f8a9",
   "metadata": {},
   "source": [
    "## 代码演示"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "e938b606",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-07-11T07:30:45.175534Z",
     "start_time": "2024-07-11T07:30:42.002944Z"
    }
   },
   "outputs": [],
   "source": [
    "import deepquantum.photonic as dqp\n",
    "import networkx as nx\n",
    "import torch\n",
    "\n",
    "from collections import defaultdict\n",
    "from strawberryfields.apps.subgraph import resize"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1ca368ba",
   "metadata": {},
   "source": [
    "### 经典算法"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6d84768e",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-06-03T07:44:18.659468Z",
     "start_time": "2024-06-03T07:44:18.655476Z"
    }
   },
   "source": [
    "这里采用下图中16个节点的图作为例子，这个图可以看作是两部分子图组成，一部分是较为稀疏的子图，对应节点为0到9，另一部分是稠密的子图，\n",
    "对应的节点为10到15。我们的目标是找到包含6个节点的稠密子图，即[10,11,12,13,14,15]组成的子图。\n",
    "\n",
    "<div style=\"margin-right: 15px; border-radius: 10px; background-color: rgb(255， 255， 255); text-align: center;\">\n",
    "    <img src=\"./fig/graph.png\" width=\"40%\"/>\n",
    "    <p style=\"padding: 10px; font-size: small; text-align: center; line-height: 0%;\">\n",
    "        <b>\n",
    "    </p>\n",
    "</div>\n",
    "\n",
    "这里的经典算法是基于贪心算法实现的，即每一次迭代寻找连接数最少的那个节点然后移除就可得到当前规模下的稠密子图， 但是如果有多个节点的连接数相同，那么它会随机选择一个节点移除，这就导致了目标稠密子图包含的节点有可能被移除，最终导致得到的结果有偏差。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "82bdfe72",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-07-11T07:30:46.958983Z",
     "start_time": "2024-07-11T07:30:46.947021Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "([10, 11, 12, 13, 14, 15], 0.9333333333333333)"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "\n",
    "a = dqp.utils.load_adj('densegraph_adj')\n",
    "graph = nx.from_numpy_array(a)\n",
    "s = range(16)\n",
    "r = resize(s, graph, min_size=1, max_size=15)\n",
    "r[6], nx.density(graph.subgraph(r[6]))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c5b0882f",
   "metadata": {},
   "source": [
    "### 量子-经典混合算法"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "167d27af",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-06-03T07:58:13.348022Z",
     "start_time": "2024-06-03T07:58:13.344483Z"
    }
   },
   "source": [
    "量子-经典混合算法中先通过高斯玻色采样得到概率较高的样本然后转化成对应的子图，这些子图可以作为经典算法的搜索起点，可以有效的提高最后结果的准确度。\n",
    "\n",
    "这里先读取已有的高斯玻色采样数据，采样次数为十万次，``gbs.postselect`` 函数先挑出那些对应子图节点数为8、10的样本，然后将这些样本子图作为经典搜索算法的起点，可以得到一个最终收敛到节点为6的子图字典。\n",
    "字典中包含了节点数为6的图密度较大的多个子图， 我们取图密度最大的那个子图就是最终的结果。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "4882e3cd",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-07-11T07:30:48.645492Z",
     "start_time": "2024-07-11T07:30:48.625558Z"
    },
    "code_folding": [
     0
    ]
   },
   "outputs": [],
   "source": [
    "def search_subgpraph(samples: list,\n",
    "                    graph: nx.Graph,\n",
    "                    min_size: int,\n",
    "                    max_size: int):\n",
    "    \"\"\"Get the densest subgraph with size in [min_size, max_size],\n",
    "        using classical algorithm with samples from GBS\n",
    "    \"\"\"\n",
    "    dic_list = defaultdict(list)\n",
    "    for i in range(len(samples)):\n",
    "        temp= samples[i]\n",
    "        num = 1\n",
    "        for key in temp.keys():\n",
    "            if num < 50: # only need 50 samples\n",
    "                idx = torch.nonzero(torch.tensor(key)).squeeze()\n",
    "                r = resize(idx.tolist(), graph, min_size=min_size, max_size=max_size)\n",
    "                for j in range(min_size, max_size+2, 2):\n",
    "                    density = nx.density(graph.subgraph(r[j]))\n",
    "                    temp_value = (r[j], np.round(density, 5))\n",
    "                    if temp_value not in dic_list[j]:\n",
    "                        dic_list[j].append(temp_value)\n",
    "                num = num + 1\n",
    "    return dic_list"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "03bffcc6",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-07-11T07:30:49.551425Z",
     "start_time": "2024-07-11T07:30:49.154694Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "([10, 11, 12, 13, 14, 15], 0.93333)\n"
     ]
    }
   ],
   "source": [
    "# 后处理得到节点数为8和10个子图对应的样本\n",
    "sample_re = dqp.utils.load_sample('densegraph_sample')\n",
    "gbs = dqp.GBS_Graph(adj_mat=torch.tensor(a, dtype = torch.float64), cutoff=2)\n",
    "state = gbs()\n",
    "subgraph_sample = gbs.postselect(sample_re, [8, 10])\n",
    "\n",
    "# 采用shrinking 方法得到节点数为6和8的稠密子图\n",
    "dense_sub_graph = search_subgpraph(subgraph_sample, graph, min_size=6, max_size=8)\n",
    "print(dense_sub_graph[6][0])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "688feacf",
   "metadata": {},
   "source": [
    "### 量子算法"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2603e22e",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-06-03T05:54:01.830173Z",
     "start_time": "2024-06-03T05:54:01.825146Z"
    }
   },
   "source": [
    "量子算法直接将高斯玻色采样后的6个节点对应的样本挑选出来处理，因为根据前面的讨论可以知道，对应的子图越稠密那么其样本出现的概率也就越大。\n",
    "这里先读取已有的高斯玻色采样数据，采样次数为十万次，``gbs.postselect`` 函数先挑出那些对应子图节点数为6的样本，然后``gbs.graph_density`` 函数将\n",
    "这些样本映射成子图再计算子图的图密度，最后按图密度从大到小排列给出对应的子图及其图密度。\n",
    "从最后的结果可以看到，高斯玻色采样成功采到了图密度最高的6个节点的子图，对应的图密度为0.9333，对应的节点为[10,11,12,13,14,15]。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "3beb662d",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-07-11T07:30:51.411404Z",
     "start_time": "2024-07-11T07:30:51.281837Z"
    }
   },
   "outputs": [],
   "source": [
    "sample_re = dqp.utils.load_sample('densegraph_sample')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "93be9a37",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-07-11T07:30:52.285790Z",
     "start_time": "2024-07-11T07:30:52.194101Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1) [tensor([144]), 0.9333333333333333]\n"
     ]
    }
   ],
   "source": [
    "subgraph_sample = gbs.postselect(sample_re, [6])\n",
    "subgraph_density = gbs.graph_density(graph, subgraph_sample[0])\n",
    "key = list(subgraph_density.keys())\n",
    "print(key[0],subgraph_density[key[0]])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bbd14fcb",
   "metadata": {},
   "source": [
    "## 附录"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "450ce075",
   "metadata": {},
   "source": [
    "[1] J. M. Arrazola and T. R. Bromley, Physical Review Letters 121, 030503 (2018)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "dq_v3",
   "language": "python",
   "name": "dq_v3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.14"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {
    "height": "calc(100% - 180px)",
    "left": "10px",
    "top": "150px",
    "width": "168.929px"
   },
   "toc_section_display": true,
   "toc_window_display": true
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
